62 research outputs found
The UDP-Glycosyltransferase Family in Drosophila melanogaster: Nomenclature Update, Gene Expression and Phylogenetic Analysis.
UDP-glycosyltransferases (UGTs) are important conjugation enzymes found in all kingdoms of life, catalyzing a sugar conjugation with small lipophilic compounds and playing a crucial role in detoxification and homeostasis. The UGT gene family is defined by a signature motif in the C-terminal domain where the uridine diphosphate (UDP)-sugar donor binds. UGTs have been identified in a number of insect genomes over the last decade and much progress has been achieved in characterizing their expression patterns and molecular functions. Here, we present an update of the complete repertoire of UGT genes in Drosophila melanogaster and provide a brief overview of the latest research in this model insect. A total of 35 UGT genes are found in the D. melanogaster genome, localized to chromosomes 2 and 3 with a high degree of gene duplication on chromosome arm 3R. All D. melanogaster UGT genes have now been named in FlyBase according to the unified UGT nomenclature guidelines. A phylogenetic analysis of UGT genes shows lineage-specific gene duplications. Analysis of anatomical and induced gene expression patterns demonstrates that some UGT genes are differentially expressed in various tissues or after environmental treatments. Extended searches of UGT orthologs from 18 additional Drosophila species reveal a diversity of UGT gene numbers and composition. The roles of Drosophila UGTs identified to date are briefly reviewed, and include xenobiotic metabolism, nicotine resistance, olfaction, cold tolerance, sclerotization, pigmentation, and immunity. Together, the updated genomic information and research overview provided herein will aid further research in this developing field.
The aminoacyl-tRNA synthetases of Drosophila melanogaster.
Aminoacyl-tRNA synthetases (aaRSs) ligate amino acids to their cognate tRNAs, allowing them to decode the triplet code during translation. Through different mechanisms, aaRSs also perform several non-canonical functions in transcription, translation, apoptosis, angiogenesis and inflammation. Drosophila has become a preferred system to model human diseases caused by mutations in aaRS genes, to dissect effects of reduced translation or non-canonical activities, and to study aminoacylation and translational fidelity. However, the lack of a systematic annotation of this gene family has hampered such studies. Here, we report the identification of the entire set of aaRS genes in the fly genome and we predict their roles based on experimental evidence and/or orthology. Further, we propose a new, systematic and logical nomenclature for aaRSs. We also review the research conducted on Drosophila aaRSs to date. Together, our work provides the foundation for further research in the fly aaRS field. S.J.M. is supported by the FlyBase NIH/NHGRI grant U41HG000739 (W.M. Gelbart, Harvard University, PI; N.H. Brown, University of Cambridge, coPI). J.L. was supported by Cancer Research Switzerland and a SNF grant to B.S., and by a SNSF Early Postdoc Fellowship. This is the final version of the article. It was first available from Taylor & Francis via http://dx.doi.org/10.1080/19336934.2015.110119
Towards comprehensive annotation of Drosophila melanogaster enzymes in FlyBase.
The catalytic activities of enzymes can be described using Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. These annotations are available from numerous biological databases and are routinely accessed by researchers and bioinformaticians to direct their work. However, enzyme data may not be congruent between different resources, while the origin, quality and genomic coverage of these data within any one resource are often unclear. GO/EC annotations are assigned either manually by expert curators or inferred computationally, and there is potential for errors in both types of annotation. If such errors remain unchecked, false positive annotations may be propagated across multiple resources, significantly degrading the quality and usefulness of these data. Similarly, the absence of annotations (false negatives) from any one resource can lead to incorrect inferences or conclusions. We are systematically reviewing and enhancing the functional annotation of the enzymes of Drosophila melanogaster, focusing on improvements within the FlyBase (www.flybase.org) database. We have reviewed four major enzyme groups to date: oxidoreductases, lyases, isomerases and ligases. Herein, we describe our review workflow, the improvement in the quality and coverage of enzyme annotations within FlyBase and the wider impact of our work on other related databases.
Building a pipeline to solicit expert knowledge from the community to aid gene summary curation.
Brief summaries describing the function of each gene's product(s) are of great value to the research community, especially when interpreting genome-wide studies that reveal changes to hundreds of genes. However, manually writing such summaries, even for a single species, is a daunting task; for example, the Drosophila melanogaster genome contains almost 14 000 protein-coding genes. One solution is to use computational methods to generate summaries, but this often fails to capture the key functions or express them eloquently. Here, we describe how we solicited help from the research community to generate manually written summaries of D. melanogaster gene function. Based on the data within the FlyBase database, we developed a computational pipeline to identify researchers who have worked extensively on each gene. We e-mailed these researchers to ask them to draft a brief summary of the main function(s) of the gene's product, which we edited for consistency to produce a 'gene snapshot'. This approach yielded 1800 gene snapshot submissions within a 3-month period. We discuss the general utility of this strategy for other databases that capture data from the research literature. Database URL: https://flybase.org/
Identification and bioinformatic analysis of neprilysin and neprilysin-like metalloendopeptidases in Drosophila melanogaster.
The neprilysin (M13) family of metalloendopeptidases comprises highly conserved ectoenzymes that cleave and thereby inactivate many physiologically relevant peptides in the extracellular space. Impaired neprilysin activity is associated with numerous human diseases. Here, we present a comprehensive list and classification of M13 family members in Drosophila melanogaster. Seven Neprilysin (Nep) genes encode active peptidases, while 21 Neprilysin-like (Nepl) genes encode proteins predicted to be catalytically inactive. RNAseq data demonstrate that all 28 genes are expressed during development, often in a tissue-specific pattern. Most Nep proteins possess a transmembrane domain, whereas almost all Nepl proteins are predicted to be secreted.
Automatic categorization of diverse experimental information in the bioscience literature
Background:
Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify, from all published literature, the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest, and is thus usually time-consuming. We developed an automatic method, based on the Support Vector Machine (SVM) machine learning method, for identifying papers containing these curation data types among a large pool of published scientific papers. This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in production use for automatic categorization of 10 different experimental data types in the biocuration process at WormBase for the past two years, and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community, thereby greatly reducing the time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure to utilize training papers of similar data types from different bodies of literature, such as C. elegans and D. melanogaster, to identify papers with any of these data types for a single database. This approach has great significance because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance.
Results:
We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genome Informatics (MGI). It is being used in the curation workflow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction.
Conclusions:
Our methods are applicable to a variety of data types with training sets containing several hundred to a few thousand documents. They are completely automatic and thus can be readily incorporated into different workflows at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort.
The DNA polymerases of Drosophila melanogaster.
DNA synthesis during replication or repair is a fundamental cellular process that is catalyzed by a set of evolutionarily conserved polymerases. Despite a large body of research, the DNA polymerases of Drosophila melanogaster have not yet been systematically reviewed, leading to inconsistencies in their nomenclature, shortcomings in their functional (Gene Ontology, GO) annotations and an under-appreciation of the extent of their characterization. Here, we describe the complete set of DNA polymerases in D. melanogaster, applying nomenclature already in widespread use in other species, and improving their functional annotation. A total of 19 genes encode the proteins comprising three replicative polymerases (alpha-primase, delta, epsilon), five translesion/repair polymerases (zeta, eta, iota, Rev1, theta) and the mitochondrial polymerase (gamma). We also provide an overview of the biochemical and genetic characterization of these factors in D. melanogaster. This work, together with the incorporation of the improved nomenclature and GO annotation into key biological databases, including FlyBase and UniProtKB, will greatly facilitate access to information about these important proteins.
The Drosophila phenotype ontology
BACKGROUND: Phenotype ontologies are queryable classifications of phenotypes. They provide a widely-used means for annotating phenotypes in a form that is human-readable, programmatically accessible and that can be used to group annotations in biologically meaningful ways. Accurate manual annotation requires clear textual definitions for terms. Accurate grouping and fruitful programmatic usage require high-quality formal definitions that can be used to automate classification. The Drosophila phenotype ontology (DPO) has been used to annotate over 159,000 phenotypes in FlyBase to date, but until recently lacked textual or formal definitions. RESULTS: We have composed textual definitions for all DPO terms and formal definitions for 77% of them. Formal definitions reference terms from a range of widely-used ontologies including the Phenotype and Trait Ontology (PATO), the Gene Ontology (GO) and the Cell Ontology (CL). We also describe a generally applicable system, devised for the DPO, for recording and reasoning about the timing of death in populations. As a result of the new formalisations, 85% of classifications in the DPO are now inferred rather than asserted, with much of this classification leveraging the structure of the GO. This work has significantly improved the accuracy and completeness of classification and made further development of the DPO more sustainable. CONCLUSIONS: The DPO provides a set of well-defined terms for annotating Drosophila phenotypes and for grouping and querying the resulting annotation sets in biologically meaningful ways. Such queries have already resulted in successful function predictions from phenotype annotation. Moreover, such formalisations make extended queries possible, including cross-species queries via the external ontologies used in formal definitions. The DPO is openly available under an open source license in both OBO and OWL formats. There is good potential for it to be used more broadly by the Drosophila community, which may ultimately result in its extension to cover a broader range of phenotypes.
Exploring FlyBase Data Using QuickSearch.
FlyBase (flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila species, with a major focus on the model organism Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic-scale and high-throughput technologies, means that FlyBase now houses a huge quantity of data. Researchers need to be able to rapidly and intuitively query these data, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This unit describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research. © 2016 by John Wiley & Sons, Inc.
The ribosomal protein genes and Minute loci of Drosophila melanogaster.
BACKGROUND: Mutations in genes encoding ribosomal proteins (RPs) have been shown to cause an array of cellular and developmental defects in a variety of organisms. In Drosophila melanogaster, disruption of RP genes can result in the 'Minute' syndrome of dominant, haploinsufficient phenotypes, which include prolonged development, short and thin bristles, and poor fertility and viability. While more than 50 Minute loci have been defined genetically, only 15 have so far been characterized molecularly and shown to correspond to RP genes. RESULTS: We combined bioinformatic and genetic approaches to conduct a systematic analysis of the relationship between RP genes and Minute loci. First, we identified 88 genes encoding 79 different cytoplasmic RPs (CRPs) and 75 genes encoding distinct mitochondrial RPs (MRPs). Interestingly, nine CRP genes are present as duplicates and, while all appear to be functional, one member of each gene pair has relatively limited expression. Next, we defined 65 discrete Minute loci by genetic criteria. Of these, 64 correspond to, or very likely correspond to, CRP genes; the single non-CRP-encoding Minute gene encodes a translation initiation factor subunit. Significantly, MRP genes and more than 20 CRP genes do not correspond to Minute loci. CONCLUSION: This work answers a longstanding question about the molecular nature of Minute loci and suggests that Minute phenotypes arise from suboptimal protein synthesis resulting from reduced levels of cytoribosomes. Furthermore, by identifying the majority of haplolethal and haplosterile loci at the molecular level, our data will directly benefit efforts to attain complete deletion coverage of the D. melanogaster genome.
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.